The objective of this research is to enable safety-critical systems to simultaneously learn and execute optimal control policies in a safe manner, achieving complex autonomy. Learning optimal policies via trial and error, i.e., traditional reinforcement learning, is difficult to implement in safety-critical systems, particularly when task restarts are unavailable. Safe model-based reinforcement learning techniques based on a barrier transformation have recently been developed to address this problem. However, these methods rely on full-state feedback, which limits their usability in real-world environments. In this work, an output-feedback safe model-based reinforcement learning technique based on a novel barrier-aware dynamic state estimator is designed to address this issue. The developed approach facilitates simultaneous learning and execution of safe control policies for safety-critical linear systems. Simulation results indicate that the barrier transformation is an effective approach for achieving online reinforcement learning in safety-critical systems using output feedback.
We introduced a working-memory-augmented adaptive controller in our recent work. The controller uses attention to read from and write to the working memory. Attention allows the controller to read specific information that is relevant and to update its working memory based on relevance, similar to how humans pick relevant information from the enormous amount received through the various senses. The retrieved information is used to modify the final control input computed by the controller. We showed that this modification speeds up learning.

In that work, we used a soft-attention mechanism for the adaptive controller. Controllers that use soft attention update and read information from all memory locations at all times, to an extent determined by relevance. For the same reason, however, the information stored in the memory can be lost. In contrast, hard attention updates and reads from only one location at any point in time, which allows the memory to retain the information stored in the other locations. The downside is that the controller can fail to shift attention when the information in the current location becomes less relevant.

We propose an attention mechanism that comprises (i) a hard-attention mechanism and (ii) an attention-reallocation mechanism. The reallocation enables the controller to shift attention to a different location when the relevance of the location it is reading from diminishes. It also ensures that the information stored in the memory before the shift in attention is retained, which can be lost under both soft- and hard-attention mechanisms. Through detailed simulations of various scenarios for two-link robot arm systems, we illustrate the effectiveness of the proposed attention mechanism.
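The hard-attention-with-reallocation idea described above can be sketched as follows. This is an illustrative toy, not the authors' controller: the slot keys, the cosine-similarity relevance measure, and the threshold rule are all assumptions made for the example. The essential behavior it shows is that reads and writes touch exactly one slot, and attention moves to a new slot only when the current slot's relevance falls below a threshold, so the contents of every other slot are preserved.

```python
import numpy as np

class HardAttentionMemory:
    """Toy working memory with hard attention and threshold-based
    reallocation (illustrative names; not the paper's implementation)."""

    def __init__(self, n_slots, dim, threshold=0.5):
        self.memory = np.zeros((n_slots, dim))   # stored contents
        self.keys = np.zeros((n_slots, dim))     # addressing keys per slot
        self.focus = 0                           # the single attended slot
        self.threshold = threshold

    def relevance(self, query):
        # Cosine similarity between the query and each slot key.
        q = query / (np.linalg.norm(query) + 1e-8)
        k = self.keys / (np.linalg.norm(self.keys, axis=1, keepdims=True) + 1e-8)
        return k @ q

    def read(self, query):
        rel = self.relevance(query)
        # Reallocation: shift attention only when the current slot's
        # relevance drops below the threshold. No other slot is touched,
        # so previously stored information is retained.
        if rel[self.focus] < self.threshold:
            self.focus = int(np.argmax(rel))
        return self.memory[self.focus]

    def write(self, value):
        # Hard write: only the attended slot is updated.
        self.memory[self.focus] = np.asarray(value, dtype=float)
```

A soft-attention variant would instead blend a write across all slots in proportion to relevance, which is exactly how stored information can be gradually overwritten; the hard write above avoids that at the cost of needing the reallocation rule to escape a stale slot.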
